Autonomous Learning with High-Dimensional Computing Architecture Similar to von Neumann's

Kanerva, Pentti

arXiv.org Artificial Intelligence

We model human and animal learning by computing with high-dimensional vectors (H = 10,000 for example). The architecture resembles traditional (von Neumann) computing with numbers, but the instructions refer to vectors and operate on them in superposition. The architecture includes a high-capacity memory for vectors, an analogue of the random-access memory (RAM) for numbers. The model's ability to learn from data reminds us of deep learning, but with an architecture closer to biology. The architecture agrees with an idea from psychology that human memory and learning involve a short-term working memory and a long-term data store. Neuroscience provides us with a model of the long-term memory, namely, the cortex of the cerebellum. With roots in psychology, biology, and traditional computing, a theory of computing with vectors can help us understand how brains compute. Application to learning by robots seems inevitable, but there is likely to be more, including language. Ultimately we want to compute with no more material and energy than used by brains. To that end, we need a mathematical theory that agrees with psychology and biology, and is suitable for nanotechnology. We also need to exercise the theory in large-scale experiments. Computing with vectors is described here in terms familiar to us from traditional computing with numbers.
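The basic vector operations of such an architecture can be illustrated with a short sketch. This is a generic high-dimensional computing example, not code from the paper: random bipolar vectors serve as symbols, elementwise multiplication binds a role to a filler, and addition followed by the sign function bundles several bound pairs into one vector in superposition.

```python
import numpy as np

rng = np.random.default_rng(0)
H = 10_000  # dimensionality; the paper's example value

def rand_vec():
    """Random bipolar (+1/-1) vector; quasi-orthogonal to any other."""
    return rng.choice([-1, 1], size=H)

# Binding (elementwise multiply) pairs a role with a filler;
# bundling (sum + sign) superposes several bound pairs in one vector.
color, shape = rand_vec(), rand_vec()
red, circle = rand_vec(), rand_vec()
record = np.sign(color * red + shape * circle)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Unbinding with the role vector recovers a noisy copy of the filler;
# a clean-up (item) memory identifies it by similarity.
noisy_red = record * color
print(cosine(noisy_red, red) > cosine(noisy_red, circle))  # True
```

Because the vectors are quasi-orthogonal, the unbound result correlates strongly with `red` (cosine near 0.7) and only negligibly with `circle`, which is what makes retrieval from superposition reliable at high dimension.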


Crafting Personalized Agents through Retrieval-Augmented Generation on Editable Memory Graphs

Wang, Zheng, Li, Zhongyang, Jiang, Zeren, Tu, Dandan, Shi, Wei

arXiv.org Artificial Intelligence

In the age of mobile internet, user data, often referred to as memories, is continuously generated on personal devices. Effectively managing and utilizing this data to deliver services to users is a compelling research topic. In this paper, we introduce a novel task of crafting personalized agents powered by large language models (LLMs), which utilize a user's smartphone memories to enhance downstream applications with advanced LLM capabilities. To achieve this goal, we introduce EMG-RAG, a solution that combines Retrieval-Augmented Generation (RAG) techniques with an Editable Memory Graph (EMG). This approach is further optimized using Reinforcement Learning to address three distinct challenges: data collection, editability, and selectability. Extensive experiments on a real-world dataset validate the effectiveness of EMG-RAG, achieving an improvement of approximately 10% over the best existing approach. Additionally, the personalized agents have been transferred into a real smartphone AI assistant, which leads to enhanced usability.
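The retrieval step at the heart of any RAG pipeline can be sketched in a few lines. This toy uses bag-of-words vectors and cosine similarity in place of EMG-RAG's learned memory graph and retriever; the memory strings are invented for illustration:

```python
from collections import Counter
import math

# Toy "memories" standing in for data collected on a device.
memories = [
    "User's daughter's birthday is June 12",
    "User prefers vegetarian restaurants",
    "User commutes by subway line 4",
]

def embed(text):
    """Bag-of-words embedding (a real system would use a learned encoder)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    return dot / ((math.sqrt(sum(v * v for v in a.values())) *
                   math.sqrt(sum(v * v for v in b.values()))) or 1.0)

def retrieve(query, k=1):
    """Return the k memories most similar to the query."""
    q = embed(query)
    return sorted(memories, key=lambda m: cosine(embed(m), q), reverse=True)[:k]

context = retrieve("book a restaurant the user would like")
prompt = "Context: " + "; ".join(context) + "\nAnswer the user's request."
print(context)  # ['User prefers vegetarian restaurants']
```

The retrieved memory is prepended to the LLM prompt; EMG-RAG's contribution is learning, via reinforcement learning, which parts of an editable memory graph to retrieve rather than ranking flat strings as above.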


Evolutionary Training and Abstraction Yields Algorithmic Generalization of Neural Computers

Tanneberg, Daniel, Rueckert, Elmar, Peters, Jan

arXiv.org Artificial Intelligence

A key feature of intelligent behaviour is the ability to learn abstract strategies that scale and transfer to unfamiliar problems. An abstract strategy solves every sample from a problem class, no matter its representation or complexity -- like algorithms in computer science. Neural networks are powerful models for processing sensory data, discovering hidden patterns, and learning complex functions, but they struggle to learn such iterative, sequential, or hierarchical algorithmic strategies. Extending neural networks with external memories has increased their capacity to learn such strategies, but they remain prone to data variations, struggle to learn scalable and transferable solutions, and require massive training data. We present the Neural Harvard Computer (NHC), a memory-augmented network architecture that employs abstraction by decoupling algorithmic operations from data manipulations, realized by splitting the information flow across separate modules. This abstraction mechanism and evolutionary training enable the learning of robust and scalable algorithmic solutions. On a diverse set of 11 algorithms with varying complexities, we show that the NHC reliably learns algorithmic solutions with strong generalization and abstraction: perfect generalization and scaling to arbitrary task configurations and complexities far beyond those seen during training, independent of the data representation and the task domain.


Understanding Python: Part 1

#artificialintelligence

This is the first part of "Understanding Python," in which we cover five fundamentals of Python. Let us jump right into each of them so that by the end of your reading, you will have a basic working knowledge of Python. Note: all the coding examples in this blog were written in a Jupyter notebook, a web-based environment for running Python code. It is highly recommended for learners who want to start with a few lines of code. Python is an object-oriented language, and every object is stored in memory when it is instantiated.
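That last point is easy to demonstrate. Each object occupies its own place in memory, and `id()` returns an identifier for it (in CPython, its memory address); the variable names are just labels bound to those objects:

```python
# Every Python object lives at a memory location once instantiated;
# id() exposes an identifier for it (in CPython, the address).
a = [1, 2, 3]
b = a          # b is another name for the same object
c = [1, 2, 3]  # a new object with equal contents

print(a is b)  # True: same object in memory
print(a is c)  # False: distinct objects...
print(a == c)  # ...that compare equal by value
```

The distinction between identity (`is`) and equality (`==`) follows directly from objects living at distinct memory locations.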


An Algorithm for Automatically Updating a Forsyth-Edwards Notation String Without an Array Board Representation

Iqbal, Azlan

arXiv.org Artificial Intelligence

We present an algorithm that correctly updates the Forsyth-Edwards Notation (FEN) chessboard character string after any move is made, without the need for an intermediary array representation of the board. In particular, this relates to software that has to do with chess, certain chess variants, and possibly even similar board games with comparable position representation. Even when performance may be equal or inferior to using arrays, the algorithm still provides an accurate and viable alternative for accomplishing the same thing, or for cases where additional or side processing is needed in conjunction with arrays. Furthermore, the end result (i.e. an updated FEN string) is immediately ready for export to any other internal module or external program, unlike an intermediary array, which must first be converted into a FEN string for export purposes. The algorithm is especially useful when there are no existing array-based modules to represent a visual board, as it can do without them entirely. We provide examples that demonstrate the correctness of the algorithm given a variety of positions involving castling, en passant and pawn promotion.
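The flavor of working on the FEN string directly can be shown with a small sketch. This is not the paper's algorithm: it updates only the bookkeeping fields (side to move, halfmove clock, fullmove number), whereas the paper's method also rewrites the piece-placement field itself without an array board.

```python
def update_fen_clocks(fen, was_capture_or_pawn_move):
    """Update the non-board fields of a FEN string after a move.
    A simplified illustration: castling rights and the en passant
    square are left untouched here."""
    placement, active, castling, ep, halfmove, fullmove = fen.split()
    # Halfmove clock resets on any capture or pawn move, else increments.
    halfmove = "0" if was_capture_or_pawn_move else str(int(halfmove) + 1)
    if active == "b":                       # Black just moved:
        fullmove = str(int(fullmove) + 1)   # the fullmove number advances
    active = "b" if active == "w" else "w"  # toggle side to move
    return " ".join([placement, active, castling, ep, halfmove, fullmove])

start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
print(update_fen_clocks(start, was_capture_or_pawn_move=True))
# rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR b KQkq - 0 1
```

Everything happens on the string's six space-separated fields; no 8x8 array is ever materialized, which is the property the paper extends to the placement field as well.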


Meltdown

Communications of the ACM

Moritz Lipp is a Ph.D. candidate at Graz University of Technology, Graz, Austria. Michael Schwarz is a postdoctoral researcher at Graz University of Technology, Graz, Austria. Daniel Gruss is an assistant professor at Graz University of Technology, Graz, Austria. Thomas Prescher is a chief architect at Cyberus Technology GmbH, Dresden, Germany. Werner Haas is the Chief Technology Officer at Cyberus Technology GmbH, Dresden, Germany.


wisardpkg -- A library for WiSARD-based models

Filho, Aluizio S. Lima, Guarisa, Gabriel P., Filho, Leopoldo A. D. Lusquino, Oliveira, Luiz F. R., Franca, Felipe M. G., Lima, Priscila M. V.

arXiv.org Artificial Intelligence

In order to facilitate the production of code using WiSARD-based models, LabZero developed wisardpkg, a C++/Python machine-learning library. The library is an MIT-licensed open-source package hosted on GitHub.
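WiSARD is a weightless neural network built from RAM nodes, and its core idea fits in a few lines. The sketch below is a minimal discriminator written from scratch to illustrate the model; it is not wisardpkg's actual API:

```python
class Discriminator:
    """Minimal WiSARD discriminator: the binary input is split into
    fixed-size tuples, each addressing one RAM node. Training stores
    the addresses seen; scoring counts nodes that recognize an address."""
    def __init__(self, input_size, tuple_size):
        self.tuple_size = tuple_size
        self.rams = [set() for _ in range(input_size // tuple_size)]

    def _addresses(self, bits):
        n = self.tuple_size
        return [tuple(bits[i * n:(i + 1) * n]) for i in range(len(self.rams))]

    def train(self, bits):
        for ram, addr in zip(self.rams, self._addresses(bits)):
            ram.add(addr)

    def score(self, bits):
        # Number of RAM nodes that have seen this address before.
        return sum(addr in ram for ram, addr in zip(self.rams, self._addresses(bits)))

d = Discriminator(input_size=8, tuple_size=2)
d.train([1, 0, 1, 0, 1, 1, 0, 0])
print(d.score([1, 0, 1, 0, 1, 1, 0, 0]))  # 4: all four nodes match
print(d.score([0, 1, 1, 0, 1, 1, 0, 0]))  # 3: the first node differs
```

A full classifier keeps one discriminator per class and predicts the class whose discriminator scores highest; since training is just set insertion, learning is a single pass with no gradients.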


Progress Extrapolating Algorithmic Learning to Arbitrary Sequence Lengths

Robinson, Andreas

arXiv.org Machine Learning

Recent neural network models for algorithmic tasks have led to significant improvements in extrapolation to sequences much longer than training, but it remains an outstanding problem that the performance still degrades for very long or adversarial sequences. We present alternative architectures and loss-terms to address these issues, and our testing of these approaches has not detected any remaining extrapolation errors within memory constraints. We focus on linear time algorithmic tasks including copy, parentheses parsing, and binary addition. First, activation binning was used to discretize the trained network in order to avoid computational drift from continuous operations, and a binning-based digital loss term was added to encourage discretizable representations. In addition, a localized differentiable memory (LDM) architecture, in contrast to distributed memory access, addressed remaining extrapolation errors and avoided unbounded growth of internal computational states. Previous work has found that algorithmic extrapolation issues can also be alleviated with approaches relying on program traces, but the current effort does not rely on such traces.
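The activation-binning idea can be made concrete with a small sketch. The bin centers below are an illustrative choice, not the paper's: each activation is snapped to its nearest center at inference time, and a loss term penalizes distance from the centers so that the trained representations survive this discretization.

```python
import numpy as np

bins = np.array([0.0, 0.5, 1.0])  # bin centers (illustrative choice)

def binned(activations):
    """Snap each activation to its nearest bin center, removing the
    small continuous drift that accumulates over long sequences."""
    idx = np.abs(activations[:, None] - bins[None, :]).argmin(axis=1)
    return bins[idx]

def digital_loss(activations):
    """Penalty that is zero only when every activation sits exactly on
    a bin center -- encourages discretizable representations."""
    return float(np.min(np.abs(activations[:, None] - bins[None, :]), axis=1).sum())

a = np.array([0.04, 0.48, 0.97])
print(binned(a))                  # [0.  0.5 1. ]
print(round(digital_loss(a), 2))  # 0.09
```

After binning, repeated application of the network's update step operates on exactly representable values, so errors cannot drift and compound over arbitrarily long inputs.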


Neural Turing Machines

#artificialintelligence

We discuss the Neural Turing Machine (NTM), an architecture proposed by Graves et al. at DeepMind. NTMs are designed to solve tasks that require writing information to and retrieving it from an external memory, which makes them resemble a working memory system that can be described by short-term storage (memory) of information and its rule-based manipulation. Compared with RNN structures with internal memory, NTMs use attentional mechanisms to efficiently read and write an external memory, which makes them a more favorable choice for capturing long-range dependencies. But, as we will see, these two are not independent of each other and can be combined to form a more powerful architecture. The overall architecture of the NTM is shown in Figure 1, where the controller is a general neural network, an MLP or RNN, which receives inputs and the previous read vectors and emits outputs in response.
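The content-based part of the NTM's attentional read can be sketched directly from its definition: each memory row is weighted by a softmax over its cosine similarity to an emitted key, sharpened by a key strength beta, and the read vector is the weighted sum of rows. The toy memory below is illustrative:

```python
import numpy as np

def content_read(memory, key, beta):
    """Content-based NTM read: softmax over beta-scaled cosine
    similarities between the key and each memory row, then a
    weighted sum of rows as the read vector."""
    sims = memory @ key / (np.linalg.norm(memory, axis=1)
                           * np.linalg.norm(key) + 1e-8)
    w = np.exp(beta * sims)
    w /= w.sum()           # attention weights over memory locations
    return w, w @ memory   # weights and the resulting read vector

M = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
w, r = content_read(M, key=np.array([1.0, 0.1, 0.0]), beta=10.0)
print(w.round(3))  # weights concentrate on the first row
```

Because every step is differentiable, gradients flow through the addressing itself; the full NTM adds interpolation, shifting, and sharpening stages on top of this content-based weighting.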


Memory-Augmented Neural Networks for Machine Translation

Collier, Mark, Beel, Joeran

arXiv.org Machine Learning

Memory-augmented neural networks (MANNs) have been shown to outperform other recurrent neural network architectures on a series of artificial sequence learning tasks, yet they have had limited application to real-world tasks. We evaluate direct application of Neural Turing Machines (NTM) and Differentiable Neural Computers (DNC) to machine translation. We further propose and evaluate two models which extend the attentional encoder-decoder with capabilities inspired by memory-augmented neural networks. We evaluate our proposed models on the IWSLT Vietnamese to English and ACL Romanian to English datasets. Our proposed models and the memory-augmented neural networks perform similarly to the attentional encoder-decoder on the Vietnamese to English translation task, while having a 0.3-1.9 lower BLEU score on the Romanian to English task. Interestingly, our analysis shows that, despite being equipped with additional flexibility and being randomly initialized, memory-augmented neural networks learn an algorithm for machine translation almost identical to the attentional encoder-decoder.